Member

@danxuliu danxuliu commented Jan 8, 2026

Talk side of nextcloud/live_transcription#54

The live_transcription app now supports live translations, so instead of receiving the transcription in the original language each participant can receive it translated into their desired language.

The target language to be used for translations is conceptually similar to the preferred language for the UI, so a user setting was added for it. If it is not set, a default language is used; the default depends on the participant, as it is the best language suggested by L10N/IFactory::findLanguage. Guests do not have user settings, so in that case the setting is saved locally in the client using the browser storage. Regardless of whether the participant is a user or a guest, the target language can be selected in the app settings.
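
As a rough illustration of the fallback order described above (a sketch only, not the code in this pull request): users read the value from their user setting, guests read it from browser storage, and both fall back to the default suggested by the server. The `Participant` shape, the `KeyValueStore` interface and the storage key name are all assumptions made for the example.

```typescript
// Hypothetical names throughout; this only sketches the fallback order.

// Minimal subset of the browser Storage API, so the sketch can also run outside a browser.
interface KeyValueStore {
	getItem(key: string): string | null
	setItem(key: string, value: string): void
}

interface Participant {
	isGuest: boolean
	// Value of the user setting, or null if it was never set (always null for guests).
	userSettingLanguage: string | null
}

// Assumed key name, not taken from the actual implementation.
const GUEST_STORAGE_KEY = 'live_translation_target_language'

function resolveTargetLanguage(
	participant: Participant,
	serverDefaultLanguage: string,
	storage: KeyValueStore,
): string {
	// Guests have no user settings, so their choice lives in the browser storage.
	const saved = participant.isGuest
		? storage.getItem(GUEST_STORAGE_KEY)
		: participant.userSettingLanguage
	// Fall back to the default language provided by the server.
	return saved ?? serverDefaultLanguage
}

function saveGuestTargetLanguage(storage: KeyValueStore, language: string): void {
	storage.setItem(GUEST_STORAGE_KEY, language)
}
```

In a browser, `window.localStorage` would satisfy `KeyValueStore` as written; which storage the client actually uses is not specified here.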

Note that currently some languages do not show their name, but only the code. The language metadata comes from the live_transcription app, so that needs to be fixed there.

The existing button to enable and disable transcriptions was converted to a split button to choose whether the transcription in the original language or the translation in the target language should be used.

An API endpoint was added to get the translation languages; it basically forwards the data provided by the live_transcription app, but converts the main keys from snake_case to lowerCamelCase for consistency with other Talk APIs, and adds an element for the default target language.
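
To make the key conversion concrete, here is a minimal sketch of the idea in TypeScript (the endpoint itself lives on the server, so this is an illustration rather than the real code). Only the top-level keys are renamed and nested values are forwarded untouched. The function names and the `defaultTargetLanguageId` key are assumptions for the example.

```typescript
// Hypothetical helper names; a sketch of "rename the main keys, add a default".

function snakeToLowerCamel(key: string): string {
	// "target_languages" becomes "targetLanguages"
	return key.replace(/_([a-z0-9])/g, (_match, char: string) => char.toUpperCase())
}

function toTalkResponse(
	appData: Record<string, unknown>,
	defaultTargetLanguageId: string,
): Record<string, unknown> {
	const result: Record<string, unknown> = {}
	for (const [key, value] of Object.entries(appData)) {
		// Only the main (top-level) keys are converted; values are forwarded as they are.
		result[snakeToLowerCamel(key)] = value
	}
	// Additional element that is not part of the forwarded data (assumed key name).
	result.defaultTargetLanguageId = defaultTargetLanguageId
	return result
}
```

For instance, an input with the (made up) keys `origin_languages` and `target_languages` would come out with `originLanguages`, `targetLanguages` and the extra default language element.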

Another API endpoint was added to enable or disable live translations; this requires transcriptions for the participant to be already enabled.
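
Since translations require transcriptions to be enabled first, a client has to order its requests accordingly. A minimal sketch of that ordering, using hypothetical step names and state shape rather than the actual store or endpoint names:

```typescript
// Hypothetical names; this only plans the order of the requests, it does not send them.

type Step = 'enableTranscription' | 'enableTranslation' | 'disableTranslation'

interface LiveState {
	transcriptionEnabled: boolean
	translationEnabled: boolean
}

function planTranslationToggle(state: LiveState, wantTranslation: boolean): Step[] {
	if (!wantTranslation) {
		// Disabling translations leaves transcriptions as they are.
		return state.translationEnabled ? ['disableTranslation'] : []
	}

	const steps: Step[] = []
	if (!state.transcriptionEnabled) {
		// Translations can only be enabled once transcriptions are enabled.
		steps.push('enableTranscription')
	}
	if (!state.translationEnabled) {
		steps.push('enableTranslation')
	}
	return steps
}
```

Starting from a state with nothing enabled, asking for translations yields two steps, which corresponds to two requests to the server.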

The format of the signaling messages for transcriptions and translations is compatible, so the renderer does not need to be adjusted except for getting the metadata of the languages (as a different set of languages is used for transcriptions and translations).
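
As the message itself does not say which list a language belongs to, the metadata lookup has to consider both. A small sketch of such a lookup, where the `LanguageMetadata` shape, the lookup order and the fallback to the language code are assumptions for the example:

```typescript
// Hypothetical types; the real metadata comes from the live_transcription app.

interface LanguageMetadata {
	name: string
}

type LanguageList = Record<string, LanguageMetadata | undefined>

function findLanguageMetadata(
	languageId: string,
	transcriptionLanguages: LanguageList,
	translationTargetLanguages: LanguageList,
): LanguageMetadata | undefined {
	// Check both lists, as the message type is the same for transcriptions and translations.
	return transcriptionLanguages[languageId] ?? translationTargetLanguages[languageId]
}

function languageDisplayName(
	languageId: string,
	transcriptionLanguages: LanguageList,
	translationTargetLanguages: LanguageList,
): string {
	const metadata = findLanguageMetadata(languageId, transcriptionLanguages, translationTargetLanguages)
	// Fall back to the language code if no name is known.
	return metadata?.name || languageId
}
```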

Setup

Note that, besides the live_transcription app, https://github.com/nextcloud/translate2 (or any other app that provides translations) is needed.

Note that there is currently a bug in one of the live_transcription dependencies; after installing them with pip install -r requirements.txt you would need to manually patch nc_py_api: cloud-py-api/nc_py_api#391

🖌️ UI Checklist

🖼️ Screenshots / Screencasts

🏚️ Before 🏡 After
Live-Transcription-Button-Before Live-Transcription-Button-After

🚧 Tasks

  • Add selector for preferred translation language in app settings
  • Convert live transcription button to a split button to choose between the original transcription or a translated one
  • Translations will be slower than transcriptions, so they could look broken to the user if nothing appears for quite some time. Maybe the original transcription could still be shown dimmed/transparent so the user sees that something is happening, and then be replaced by the translated text in full opacity once available. Most likely a follow-up once this has had some testing with a real server; it will also require changes in the live_transcription app, because once translations are enabled they fully replace transcriptions, so the transcription is no longer available

🏁 Checklist

  • 🌏 Tested with different browsers / clients:
    • Chromium (Chrome / Edge / Opera / Brave)
    • Firefox
    • Safari
    • Talk Desktop
    • Integrations with Files sidebar and other apps
    • Not risky to browser differences / client
  • 🖌️ Design was reviewed, approved or inspired by the design team
  • ⛑️ Tests are included or not possible
  • 📗 User documentation in https://github.com/nextcloud/documentation/tree/master/user_manual/talk has been updated or is not required

🛠️ API Checklist

🚧 Tasks

  • Capability is not fully reliable, as the capabilities of the live_transcription app cannot be retrieved from Talk. For now it is checked whether the translation tasks are available, but that could return a false positive if the live_transcription app is installed but it is an old version that does not support translations.
  • Better error handling, as right now setting the target language may fail with an internal error rather than a bad request if translations are not available in the live_transcription app
  • Right now, if translations are enabled when transcriptions are not yet enabled, first the transcriptions are enabled and then the translations are enabled. It would be possible to add a parameter to the request that enables the transcription so that it also enables the translation to a specific language, which would save one request to the server. Not sure if it is worth it, though, or if it would be a problem with API versions (although I guess not, as it would be an optional parameter).
  • Should force_language and reduce_to_languages be honoured when returning the list of translation languages?
  • Endpoints might require tweaking based on the UI development

🏁 Checklist

  • ⛑️ Tests (unit and/or integration) are included or not possible
  • 📘 API documentation in docs/ has been updated or is not required
  • 🔖 Capability is added or not needed

@danxuliu danxuliu force-pushed the add-support-for-live-translations-in-calls branch from 6db0732 to 2c47aae Compare January 13, 2026 09:50
@danxuliu danxuliu marked this pull request as ready for review January 13, 2026 10:19
</template>
{{ liveTranscriptionButtonLabel }}
</NcActionButton>
<LiveTranscriptionButton
Contributor

On narrow screens, this one should be an NcActionButton...
I see several options:

  • in LiveTranscriptionButton, made two templates:
    • 1x NcButton + NcActions > 2x NcActionsButton (for wide)
    • 3x NcActionsButton (for narrow)
  • in BottomBar, exclude translations from hiding logic
  • in BottomBar (here, L320), remove controls, so they're only available on wide (currently looks like it, but it's a bug)

Member Author

On narrow screens, this one should be an NcActionButton...

Of course... I do not know what I was thinking 🤦

I see several options:

* in LiveTranscriptionButton, made two templates:
  * 1x NcButton + NcActions > 2x NcActionsButton (for wide)

Why two action buttons? 🤔

  * 3x NcActionsButton (for narrow)

With the split button the language actions are relative, so to speak, to the transcription button. But I think that three separate actions in the menu for Enable/disable transcriptions, Original language (language name) and Translated language (language name) will look strange (and the language actions a bit out of context). Maybe Enable/disable transcriptions (language name) and Enable/disable translations (language name)?

* in BottomBar, exclude translations from hiding logic

Mmmm, I think that transcriptions/translations are not as important as the other actions there to warrant a "primary" position, so to speak.

* in BottomBar (here, L320), remove controls, so they're only available on wide (currently looks like it, but it's a bug)

I would be against that, I think that being in a narrow screen should not prevent you from having transcriptions or translations.

@nimishavijay Any opinion on all this? :-)

Contributor

Why two action buttons?

You have two at the moment, originalLanguageButtonLabel and targetLanguageButtonLabel

Enable/disable *

Sounds good to me. This should be more or less enough then:

	<div
		v-if="<button or actions prop here>"
		class="live-transcription-button-wrapper">
		<NcButton
			:title="liveTranscriptionButtonLabel"
			...
			</NcActionButton>
		</NcActions>
	</div>

	<template v-else>
		<NcActionSeparator />
		<NcActionButton
			:disabled="isLiveTranscriptionLoading"
			@click="toggleLiveTranscription">
			{{ liveTranscriptionButtonLabel }}
		</NcActionButton>
		
		<NcActionButton
			v-if="supportLiveTranslation"
			:disabled="isLiveTranscriptionLoading"
			@click="<handleOriginalLanguageSelected or handleTargetLanguageSelected>">
			{{ t('spreed', 'Enable live translation') or t('spreed', 'Disable live translation') }}
		</NcActionButton>
		<NcActionSeparator />
	</template>

Member Author

Why two action buttons?

You have two at the moment, originalLanguageButtonLabel and targetLanguageButtonLabel

Ah, sorry, I misunderstood. I thought you meant changing the current button with action buttons to just two action buttons.

Member

@nimishavijay nimishavijay Jan 14, 2026

Makes sense! Instead of enable/disable, what do you think about showing it like YouTube: Off, Original language, Translated language?

image

in BottomBar (here, L320), remove controls, so they're only available on wide (currently looks like it, but it's a bug)

I would be against that, I think that being in a narrow screen should not prevent you from having transcriptions or translations.

Not sure I understood this, does it mean we show it as a regular button, not split button, and it opens the action menu?

Member Author

in LiveTranscriptionButton, made two templates:

  • 1x NcButton + NcActions > 2x NcActionsButton (for wide)
  • 3x NcActionsButton (for narrow)

Unfortunately that does not seem to work due to a limitation of the library. So I moved back the code to the BottomBar component and removed the separate LiveTranscriptionButton component 🤷

Makes sense! Instead of enable/disable, what do you think about showing it like YouTube: Off, Original language, Translated language?

For now I just kept the existing button, although that could also be an interesting option. But I do not have a strong preference.

The "problem" would be that, if those three buttons are also shown for consistency in the menu of the bottom bar, it would also need an additional label to mark them as related to transcriptions/translations, as well as probably separators from the other actions. I am not sure if that would look fine or be a bit strange to have so "many" entries for that.

in BottomBar (here, L320), remove controls, so they're only available on wide (currently looks like it, but it's a bug)

I would be against that, I think that being in a narrow screen should not prevent you from having transcriptions or translations.

Not sure I understood this, does it mean we show it as a regular button, not split button, and it opens the action menu?

From what I understood that approach would be just hiding the controls in narrow screens, so they would not be available anywhere. That is why I was against it :-) But I might have misunderstood it :-)

Member Author

The "problem" would be that, if those three buttons are also shown for consistency in the menu of the bottom bar, it would also need an additional label to mark them as related to transcriptions/translations, as well as probably separators from the other actions. I am not sure if that would look fine or be a bit strange to have so "many" entries for that.

Actually it does not look so bad... Probably the icons should be adjusted, but anyway:
Screenshot from 2026-01-15 16-41-31

Nevertheless at this point this should be something for a follow up :-(

Live translations is an optional feature that is only available if the
external app "live_transcription" supports them.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Getting the capabilities from the live_transcription app causes the
Nextcloud capabilities to be requested (apparently when getting the
supported task types), so it enters a loop. For now, whether text2text
tasks are supported is instead checked directly in Talk when getting
the capabilities (but that does not guarantee that translations are
supported, as an old live_transcription app might be in use).

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
The setting is provided in the capabilities.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
The setting is read from the capabilities and saved to the user
settings. Guests are not able to save the user setting, so in that case
it is saved to the browser storage.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
When live translations are enabled the live_transcription app sends
the same type of signaling messages used for live transcriptions, so it
is not possible to distinguish from the message whether the metadata
should be got from the list of transcription languages or the list of
translation target languages. Due to that both lists are checked now for
the language metadata.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
@danxuliu danxuliu force-pushed the add-support-for-live-translations-in-calls branch from e65e874 to 32b81cc Compare January 15, 2026 13:25
Contributor

@Antreesy Antreesy left a comment

Frontend part looks good code-wise

@danxuliu danxuliu force-pushed the add-support-for-live-translations-in-calls branch from 32b81cc to 105ed39 Compare January 15, 2026 14:25
@danxuliu danxuliu force-pushed the add-support-for-live-translations-in-calls branch from 105ed39 to c1b8b55 Compare January 15, 2026 15:20
The button now shows the original language of the transcription and the
target language of the translation to switch between them (or directly
enable transcriptions or translations).

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
If the transcription is enabled again (from the same button, so it is
only persistent for the same call session) it will directly start with
the translation, without having to explicitly select it.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
When the bottom bar is shown in a narrow screen the split button is not
shown and replaced by actions in the menu. Now an action to enable or
disable live translations was added next to the action to enable or
disable live transcriptions.

Signed-off-by: Daniel Calviño Sánchez <[email protected]>
@danxuliu danxuliu force-pushed the add-support-for-live-translations-in-calls branch from c1b8b55 to 0e15bd7 Compare January 15, 2026 15:33
@danxuliu danxuliu enabled auto-merge January 15, 2026 15:47
@danxuliu danxuliu merged commit e2de153 into main Jan 15, 2026
83 checks passed
@danxuliu danxuliu deleted the add-support-for-live-translations-in-calls branch January 15, 2026 16:24